The effect of fundamental frequency on Mandarin speech recognition
Abstract
We study the effects of modeling tone in Mandarin speech recognition. Including the neutral tone, Mandarin has five tones, and these tones are syllable-level phenomena. A direct acoustic manifestation of tone is the fundamental frequency (f0). We report on the effect of f0 on the acoustic recognition accuracy of a Mandarin recognizer. In particular, we put f0, its first derivative (f0′), and its second derivative (f0′′) in separate streams of the feature vector. Stream weights are adjusted to investigate the individual effects of f0, f0′, and f0′′ on recognition accuracy. Our results show that incorporating the f0 feature reduced accuracy, whereas f0′ increased accuracy and f0′′ seemed to have no effect.
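The stream setup described in the abstract can be illustrated with a small sketch. It assumes two things the abstract does not confirm: that the derivatives are approximated by central differences over neighboring frames, and that streams are combined as a weighted sum of per-stream log-likelihoods, a common multi-stream formulation. The function names `delta` and `combine_streams`, the f0 values, and the log-likelihoods are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def delta(track: Sequence[float]) -> List[float]:
    """Central-difference slope of a per-frame track.

    At the edges the nearest frame is reused, so the output has the
    same length as the input (an assumption made for this sketch).
    """
    n = len(track)
    out = []
    for t in range(n):
        prev = track[max(t - 1, 0)]
        nxt = track[min(t + 1, n - 1)]
        out.append((nxt - prev) / 2.0)
    return out


def combine_streams(log_likes: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-stream log-likelihoods.

    A weight of 0 switches a stream off; a weight of 1 leaves it unchanged.
    """
    if len(log_likes) != len(weights):
        raise ValueError("one weight per stream is required")
    return sum(w * ll for w, ll in zip(weights, log_likes))


# Hypothetical f0 track in Hz for a rising contour.
f0 = [200.0, 210.0, 220.0, 230.0]
f0_d = delta(f0)     # first derivative stream (f0')
f0_dd = delta(f0_d)  # second derivative stream (f0'')

# Hypothetical per-stream log-likelihoods for one frame, in the order:
# spectral features, f0, f0', f0''.
frame_log_likes = [-10.0, -2.0, -3.0, -4.0]

# Keep the spectral and f0' streams, switch off f0 and f0''.
score = combine_streams(frame_log_likes, [1.0, 0.0, 1.0, 0.0])
```

Setting one weight to zero at a time is one way to isolate the contribution of a single stream, which is the kind of comparison the abstract describes.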
Similar resources
Mandarin-Speaking Children’s Speech Recognition: Developmental Changes in the Influences of Semantic Context and F0 Contours
The goal of this developmental speech perception study was to assess whether and how age group modulated the influences of high-level semantic context and low-level fundamental frequency (F0) contours on the recognition of Mandarin speech by elementary and middle-school-aged children in quiet and interference backgrounds. The results revealed different patterns for semantic and F0 information. ...
Effects of Semantic Context and Fundamental Frequency Contours on Mandarin Speech Recognition by Second Language Learners
Speech recognition by second language (L2) learners in optimal and suboptimal conditions has been examined extensively with English as the target language in most previous studies. This study extended existing experimental protocols (Wang et al., 2013) to investigate Mandarin speech recognition by Japanese learners of Mandarin at two different levels (elementary vs. intermediate) of proficiency...
Mandarin Tones Recognition by Segments of Fundamental Frequency Contours
Mandarin is a tonal language. It has four lexical tones (tone 1 to tone 4), each with a different fundamental frequency (f0) contour: high and flat, rising, falling then rising, and falling, respectively. To process the signal for lexical tone, we must first identify which tone it is. We would like to find out an efficient approach to identify Mandarin to...
The Relative Weight of Temporal Envelope Cues in Different Frequency Regions for Mandarin Sentence Recognition
Acoustic temporal envelope (E) cues containing speech information are distributed across the frequency spectrum. To investigate the relative weight of E cues in different frequency regions for Mandarin sentence recognition, E information was extracted from 30 contiguous bands across the range of 80-7,562 Hz using Hilbert decomposition and then allocated to five frequency regions. Recognition sc...
A study of tones and tempo in continuous Mandarin digit strings and their application in telephone quality speech recognition
Prosodic cues (namely, fundamental frequency, energy and duration) provide important information for speech. For a tonal language such as Chinese, fundamental frequency (F0) plays a critical role in characterizing tone as well, which is an essential phonemic feature. In this paper, we describe our work on duration and tone modeling for telephone-quality continuous Mandarin digits, and the appli...